Abstract: In [] we defined Inf-Datalog and characterized the fragments of Monadic inf-Datalog that have the
same expressive power as Modal Logic (resp. CTL, alternation-free Modal $\mu$-calculus and Modal $\mu$-calculus). We
study here the time and space complexity of evaluation of Monadic inf-Datalog programs on finite models. We
deduce a new unified proof that model checking has

1. linear data and program complexities (both in time and space) for CTL and alternation-free Modal $\mu$-calculus,

and

2. linear-space (data and program) complexities, linear-time program complexity and polynomial-time data
complexity for $L\mu_k$ (Modal $\mu$-calculus with fixed alternation-depth at most $k$).



1 Introduction 

The model checking problem for a logic $\mathcal{L}$ consists in verifying whether a formula $\varphi$ of $\mathcal{L}$ is satisfied in a
given structure $\mathcal{K}$. In computer-aided verification, $\mathcal{L}$ is a temporal logic, i.e. a modal logic used for the
description of the temporal ordering of events, and $\mathcal{K}$ is a (finite) Kripke structure, i.e. a graph equipped
with a labelling function associating with each node $s$ the finite set of propositional variables of $\mathcal{L}$ that
are true at node $s$.

Our approach to temporal logic model checking is based on the close relationship between model checking
and Datalog query evaluation: a Kripke structure $\mathcal{K}$ can be seen as a relational database and a formula
$\varphi$ can be thought of as a Datalog query $Q$. In this context, the model checking problem for $\varphi$ in $\mathcal{K}$
corresponds to the evaluation of $Q$ on input database $\mathcal{K}$.
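
To make this correspondence concrete, here is a minimal Python sketch. The relation names `suc`, `p`, `q` and the helper `query_ex` are illustrative assumptions, not notation from []: a small Kripke structure is stored as EDB relations, and the modal formula EX q is answered as a one-rule Datalog query.

```python
# A finite Kripke structure stored as a relational database (illustrative names).
# "suc" is the binary transition relation; each propositional variable is a
# unary relation holding the nodes where the variable is true.
database = {
    "suc": {(1, 2), (2, 3), (3, 3)},
    "p": {(1,), (2,)},
    "q": {(3,)},
}


def query_ex(db, prop):
    """Answer the Datalog query  ans(x) <- suc(x, y), prop(y),
    i.e. return the nodes satisfying the modal formula EX prop."""
    holds = {y for (y,) in db[prop]}
    return {x for (x, y) in db["suc"] if y in holds}


print(query_ex(database, "q"))  # nodes with a successor labelled q
```

Under these assumptions, model checking EX q at node 2 amounts to testing whether 2 belongs to the answer of the query.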

In [] we introduced the language inf-Datalog, which extends the usual least fixpoint semantics of Datalog with
greatest fixpoint semantics. We gave translations from various temporal logics (CTL, ETL, alternation-free
Modal $\mu$-calculus, and Modal $\mu$-calculus, by increasing order of expressive power []) into Monadic
inf-Datalog, and we also gave translations from fragments of Monadic inf-Datalog into these logics. In
this paper we give upper bounds for evaluating Monadic inf-Datalog queries: we describe an algorithm
evaluating Monadic inf-Datalog queries and analyze its complexity with respect to the size of the database
(data complexity) and its complexity with respect to the size of the program (program complexity). The
data complexity is polynomial-time and becomes linear when the program is stratified (with respect to
the nesting of least and greatest fixed points): from this we derive a unified proof of the (known) linear-time data
complexity of the model checking problem for CTL, ETL and the alternation-free $\mu$-calculus. The program
complexity of our algorithm is linear-time and linear-space, and the data complexity is linear-space too.
Using our translations in [] between the temporal logic paradigm and the database paradigm,
we can then deduce upper bounds for the complexity of the model checking problem in the aforementioned
temporal logics. This is especially worthwhile for the space complexity, which is less studied than the
time complexity.

2 Definitions

The basic definitions about Datalog can be found in [,], and the basic definitions about the $\mu$-calculus
can be found in [,]. We proceed directly with the definition of inf-Datalog.

Definition 1 An inf-Datalog program is a Datalog program where some IDB predicates are tagged 
with an overline indicating that they must be computed as greatest fixed points, and where in addition, 
for each set of mutually recursive IDB predicates including both tagged and untagged IDB predicates, 
the order of evaluation of the IDB predicates in the set is specified. 

An inf-Datalog program is said to be monadic if all the predicates occurring in the heads of the rules 
have arity at most one. An inf-Datalog program is said to be stratified if no tagged IDB predicate is 
mutually recursive with an untagged IDB predicate. 
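
The stratification condition can be checked mechanically on the dependency graph of the IDB predicates. The following Python sketch is only an illustration: the function names, the dictionary encoding of dependencies and the predicate names are assumptions made here, and the dependencies in the usage lines are invented for the purpose of the example.

```python
def reachable(graph, start):
    """IDBs reachable from `start` through one or more dependency edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def is_stratified(depends_on, tagged):
    """depends_on[h] is the set of IDBs occurring in bodies of rules with head h;
    tagged is the set of IDBs computed as greatest fixed points."""
    reach = {h: reachable(depends_on, h) for h in depends_on}
    for a in depends_on:
        for b in reach[a]:
            mutually_recursive = a in reach.get(b, set())
            if mutually_recursive and ((a in tagged) != (b in tagged)):
                return False
    return True


# theta is tagged and only depends on itself; phi depends on theta and itself.
print(is_stratified({"theta": {"theta"}, "phi": {"theta", "phi"}}, {"theta"}))
# phi2 (tagged) and theta1 (untagged) depend on each other.
print(is_stratified({"phi2": {"theta1", "phi2"}, "theta1": {"theta1", "phi2"}}, {"phi2"}))
```

The first program is stratified because the tagged predicate never depends on the untagged one; the second is not, since a tagged and an untagged predicate are mutually recursive.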

Our approach allows us to define some recursive predicates without initialization rules (non-recursive rules 
with this predicate in the head); such recursive predicates must be tagged. This approach is necessary 
in order to be able to express properties such as fairness (something must happen infinitely often). 
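
The need for tagging such predicates can be seen on a small case. The sketch below is an illustration under assumed data: the structure, the predicate name `g` and the helper functions are hypothetical. It evaluates the single rule g(x) <- p(x), suc(x, y), g(y), which has no initialization rule, once as a least fixed point and once as a greatest fixed point.

```python
def lfp(operator):
    """Iterate a monotone operator upwards from the empty set."""
    current = frozenset()
    while True:
        nxt = operator(current)
        if nxt == current:
            return current
        current = nxt


def gfp(operator, domain):
    """Iterate a monotone operator downwards from the full domain."""
    current = frozenset(domain)
    while True:
        nxt = operator(current)
        if nxt == current:
            return current
        current = nxt


# Hypothetical structure: 1 <-> 2 form a cycle, 2 also leads to 3; p holds at 1 and 2.
DOMAIN = {1, 2, 3}
SUC = {(1, 2), (2, 1), (2, 3)}
P = {1, 2}


def rule_g(current):
    """Immediate consequences of  g(x) <- p(x), suc(x, y), g(y)."""
    return frozenset(x for (x, y) in SUC if x in P and y in current)


print(lfp(rule_g))          # empty: nothing can ever be derived from nothing
print(gfp(rule_g, DOMAIN))  # the nodes starting a path along which p always holds
```

Computed as a least fixed point the predicate is empty on every database, whereas the greatest fixed point captures a genuine property of infinite paths.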

The above notion of stratification is the natural counterpart (with respect to greatest fixed points) of 
the well-known stratification with respect to negation. We give an example of a stratified inf-Datalog 
program. 

Example 2 Consider as database an infinite full binary tree, with two EDB predicates $Suc_0$ and $Suc_1$
denoting respectively the first successor and the second successor, and a unary EDB predicate $p$ (which
is meant to state some property of the nodes of the tree). The program $P$ below has as IDB predicates
$\overline{\theta}$ (computed as a greatest fixed point) and $\varphi$ (computed as a least fixed point)

$\overline{\theta}(x) \leftarrow p(x), Suc_0(x,y), Suc_1(x,z), \overline{\theta}(y), \overline{\theta}(z)$



The IDB predicate $\overline{\theta}$ in this program implements the modality AGp on the infinite full binary tree, and
the IDB predicate $\varphi$ implements the modality AFAGp: AGp means that p is always true on all paths,
and AFAGp means that, on every path we will eventually (after a finite number of steps) reach a state
wherefrom p is always true on all paths. Gp is expressed by the CTL path formula $\bot\,\widetilde{U}\,p$ and AFAGp



3 Complexity of Monadic inf-Datalog 

Theorem 2 Let $P$ be a stratified Monadic inf-Datalog program having $l$ IDB symbols, and $D$ a
relational database having $n$ elements in its domain; then the set of all $l$ queries defined by $P$ (of the
form $(P, \varphi)$, where $\varphi$ is an IDB of $P$) can be evaluated in time $n \times l$ and space $n \times l$.

Proof. By induction on the number $p$ of strata. Assume $P$ has a single stratum and, e.g., all IDBs are
untagged, hence computed as least fixed points. Let $\varphi_1, \dots, \varphi_l$ be the IDBs; then the answer $f_1, \dots, f_l$
to the set of queries $(P, \varphi_1), \dots, (P, \varphi_l)$ defined by $P$ is equal to $\sup_{i \in \mathbb{N}} T_P^i(\emptyset, \dots, \emptyset)$ and, because $D$ has
$n$ objects only, this least upper bound is obtained after at most $n \times l$ steps. The same proof applies if all IDBs are
tagged (computed as greatest fixed points).

The case where $P$ has $p$ strata is similar: since the IDBs are computed in the order of the strata, assuming
stratum $j$ has $l_j$ IDBs, the queries it defines will be computed in time $n \times l_j$, hence for the whole of $P$
the complexity will be $\sum_j n \times l_j = n \times l$. The space complexity is clear too because we have at any time
at most $l$ IDBs true of at most $n$ data objects.

This bound is tight as shown in the next ?? . □ 

Theorem ?? subsumes a result of [], where it is shown that the data complexity of Monadic Datalog is
linear-time; we prove that both the data complexity and the program complexity of stratified Monadic
inf-Datalog are linear-time and linear-space: hence adding greatest fixed points in a stratified way increases
the expressive power of Monadic Datalog without increasing its evaluation complexity. As a consequence
of ?? we get a new unified proof of the following result.

Corollary 3 The model checking problem for CTL, ETL and alternation-free Modal $\mu$-calculus can be
solved in time and space $O(|M| \times |f|)$, where $|M|$ (resp. $|f|$) is the size of the model (resp. the formula);
hence both the data and program complexities are linear in time and space.

Proof. Indeed we give in [] a translation from CTL, ETL and alternation-free Modal $\mu$-calculus into
Monadic stratified modal inf-Datalog such that the number of IDBs in the program is less than the size
of the formula. □

A unified proof of the time-linearity wrt. the size of the model is also given in [], and other proofs are 
given in [,]. 

Figure 1 A data structure of size 3 (node 1 is labelled p, r; node 2 is labelled p; node 3 is labelled q)

Example 4 Consider the structure given in ?? , where $suc(1,2), suc(2,3), p(1), p(2), q(3), r(1)$ hold and
the Monadic Datalog program:

$\varphi(x) \leftarrow q(x)$

$\varphi(x) \leftarrow p(x), suc(x,y), \varphi(y)$

$\psi(x) \leftarrow \varphi(x), r(x)$

$\psi(y) \leftarrow \psi(x), suc(x,y)$

Then, we need 6 steps to compute the queries defined by the program:

$\varphi_0 = \emptyset$, $\varphi_1 = \{3\}$, $\varphi_2 = \{2,3\}$, $\varphi_3 = \{1,2,3\} = \varphi_4 = \varphi_5 = \varphi_6$

$\psi_0 = \psi_1 = \psi_2 = \psi_3 = \emptyset$, $\psi_4 = \{1\}$, $\psi_5 = \{1,2\}$, $\psi_6 = \{1,2,3\}$.
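
The six steps can be replayed with a short Python sketch of the simultaneous iteration of the immediate-consequence operator. The function names `step` and `trace` are assumptions introduced here; the facts and rules are those of the example.

```python
# EDB facts of the example.
suc = {(1, 2), (2, 3)}
p, q, r = {1, 2}, {3}, {1}


def step(phi, psi):
    """One simultaneous application of the rules to the current (phi, psi)."""
    new_phi = set(q) | {x for (x, y) in suc if x in p and y in phi}
    new_psi = {x for x in phi if x in r} | {y for (x, y) in suc if x in psi}
    return new_phi, new_psi


def trace(steps):
    """Return the list [(phi_0, psi_0), ..., (phi_steps, psi_steps)]."""
    phi, psi = set(), set()
    history = [(phi, psi)]
    for _ in range(steps):
        phi, psi = step(phi, psi)
        history.append((phi, psi))
    return history


for i, (phi, psi) in enumerate(trace(6)):
    print(i, sorted(phi), sorted(psi))
```

The fixed point is reached exactly at step 6, which equals $n \times l$ for $n = 3$ and $l = 2$.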

We now turn to Monadic inf-Datalog programs with alternations. 

Theorem 4 Let $P$ be a program with $k-1$ alternations of least fixed points and greatest fixed points
($k$ fixed). Assume $P$ has $l$ mutually recursive IDBs. Let $D$ be a relational database having $n$ elements.
Then the set of all queries of the form $(P, \varphi)$, where $\varphi$ is an IDB of $P$, can be computed in time
$O((n+1)^k \times l)$ and space $O(n \times l)$.

Proof. Program $P$ has $k-1$ alternations of least fixed points and greatest fixed points, which means that
there exist IDBs $\varphi_1, \overline{\varphi}_2, \dots, \varphi_{k-1}, \overline{\varphi}_k$, computed in the order: first $\varphi_1$, then ..., and last $\overline{\varphi}_k$. For
simplicity, we first assume that $l = k$, $k$ even; then $P = P_k$ has the following form:



Pk < 



Pi 



k-l \ 



' Vk{x) 

, ¥k{x) 
P' 

^k-l 



Vk-l{x) 



<Pk-i{x) 



P2 < 



P' 



Pi 



' ¥'2(x) 

Mx) 



The idea of the algorithm is obtained by adapting an algorithm given in [] for evaluating boolean $\mu$-calculus
formulas and proceeds as follows. Let $f_1, \dots, f_k$ be the queries defined by $\varphi_1, \dots, \overline{\varphi}_k$. In order to
compute $f_k$ we must compute $\inf_i T^i_{P'_k}(\top)$, where $\top$ is true of every element in the data domain, and $f_k$
will be reached after at most $n$ steps (because the domain has $n$ elements). However, since $\overline{\varphi}_k$ depends
on $\varphi_{k-1}$, we must first compute $f_{k-1}[\top/\overline{\varphi}_k]$, which denotes $f_{k-1}$ in which $\top$ has been substituted
for the parameter $\overline{\varphi}_k$: this implies computing $\sup_j T^j_{P'_{k-1}[\top/\overline{\varphi}_k]}(\emptyset)$, which is again reached after at most
$n$ steps, etc. The algorithm is described in ?? .

algorithm1

VAR j_1, ..., j_k: indices;
f_k := ⊤;
FOR j_k = 1 TO n+1 DO
    f_{k-1} := ∅;
    FOR j_{k-1} = 1 TO n+1 DO
        f_{k-2} := ⊤;
        ...
            f_2 := ⊤;
            FOR j_2 = 1 TO n+1 DO
                f_1 := ∅;
                FOR j_1 = 1 TO n DO
                    f_1 := T_{P'_1}(f_1, f_2, ..., f_k);
                ENDFOR (j_1)
                f_2 := T_{P'_2}(f_1, f_2, ..., f_k);
            ENDFOR (j_2)
        ...
        f_{k-1} := T_{P'_{k-1}}(f_1, f_2, ..., f_{k-1}, f_k);
    ENDFOR (j_{k-1})
    f_k := T_{P'_k}(f_1, f_2, ..., f_{k-1}, f_k);
ENDFOR (j_k)

Figure 2 Algorithm1

Notice that in the $k-1$ first nested loops the indices have to go from 1 to $n+1$: indeed each individual $f_j$ is
computed in at most $n$ steps, but then we have to substitute the value just computed for $f_j$ in $f_1, \dots, f_{j-1}$,
whence the need for one more round of iterations. At the end $f_1, \dots, f_k$ contain the answers to the queries
defined by $\varphi_1, \dots, \overline{\varphi}_k$. The complexity of the algorithm is $(n+1) + (n+1)^2 + \cdots + (n+1)^{k-1} + n(n+1)^{k-1}$,
which is $O((n+1)^k)$.
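
The nested loops can be written compactly as a recursive procedure. The Python sketch below is one possible rendering, not the implementation of the paper: the function `algorithm1`, the operator encoding and the two-predicate test program (a least fixed point `theta` nested inside a greatest fixed point `phi`, on a hypothetical four-node structure) are all assumptions made for illustration. It also counts the operator applications per level, so that the sum above can be checked on an instance.

```python
def algorithm1(ops, domain):
    """Nested-loop evaluation in the style of Figure 2.

    ops[i] computes the new value of f_{i+1} from the current list [f_1..f_k].
    f_1, f_3, ... start from the empty set (least fixed points);
    f_2, f_4, ... start from the whole domain (greatest fixed points).
    Returns the final sets and the number of operator applications per level.
    """
    n, k = len(domain), len(ops)
    f = [frozenset()] * k
    calls = [0] * k

    def run(i):
        # (Re)compute f_1..f_{i+1} while the outer values stay fixed.
        f[i] = frozenset() if i % 2 == 0 else frozenset(domain)
        rounds = n if i == 0 else n + 1
        for _ in range(rounds):
            if i > 0:
                run(i - 1)
            f[i] = ops[i](f)
            calls[i] += 1

    run(k - 1)
    return f, calls


# Hypothetical structure: 1 <-> 2, 3 -> 3, 4 -> 3; p holds only at node 1.
DOMAIN = {1, 2, 3, 4}
SUC = {(1, 2), (2, 1), (3, 3), (4, 3)}
P = {1}


def op_theta(f):
    """theta(x) <- p(x), suc(x,y), phi(y)   and   theta(x) <- suc(x,y), theta(y)."""
    theta, phi = f
    return frozenset({x for (x, y) in SUC if x in P and y in phi}
                     | {x for (x, y) in SUC if y in theta})


def op_phi(f):
    """phi(x) <- theta(x), with phi computed as a greatest fixed point."""
    return f[0]


result, calls = algorithm1([op_theta, op_phi], DOMAIN)
print(result, calls)
```

On this instance ($n = 4$, $k = 2$) the inner operator is applied $n(n+1) = 20$ times and the outer one $n+1 = 5$ times, in agreement with the sum above; both predicates end up true exactly at nodes 1 and 2, the nodes from which some path visits p infinitely often.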

The generalization to the case when $P$ has $l$ mutually recursive IDBs, $l > k$, is straightforward: let the
IDBs of $P$ be for instance $\Phi = \Phi_1 \cup \Phi_2 \cup \cdots \cup \Phi_k$. All the IDBs in $\Phi_{2i+1}$ (resp. $\Phi_{2i}$) are untagged (resp.
tagged). The order and type of evaluation are as follows: first all IDBs of $\Phi_1$ are computed as least fixed
points, then all IDBs of $\Phi_2$ are computed as greatest fixed points, ..., and finally all IDBs of $\Phi_k$ are
computed as greatest fixed points. Assume $\Phi_i$ has $m_i$ IDBs, for $i = 1, \dots, k$.

Then it suffices to substitute for instruction $f_i := T_{P'_i}(f_1, f_2, \dots, f_i, \dots, f_k)$ the set of $m_i$ instructions:

$f_{i,1} := T_{P'_i,1}(f_1, f_2, \dots, f_i, \dots, f_k)$
...
$f_{i,m_i} := T_{P'_i,m_i}(f_1, f_2, \dots, f_i, \dots, f_k)$

where $T_{P'_i,j}(f_1, f_2, \dots, f_i, \dots, f_k)$ denotes the set of immediate consequences which can be deduced
using the rules of $P'_i$ with head $\varphi_{i,j}$. Now the complexity of the algorithm becomes:

$(n+1) \times m_k + (n+1)^2 \times m_{k-1} + \cdots + (n+1)^{k-1} \times m_2 + n(n+1)^{k-1} \times m_1$, which is
$O((n+1)^k \times \max\{m_i \mid i = 1, \dots, k\}) \le O((n+1)^k \times l)$. □
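
The last estimate can be checked by a geometric-sum argument. The following worked bound is a sketch under the assumption $n \ge 1$; the symbol $m$, standing for $\max\{m_i \mid i = 1, \dots, k\}$, is introduced here only for this computation.

```latex
\begin{align*}
\sum_{i=2}^{k} (n+1)^{k-i+1} m_i + n(n+1)^{k-1} m_1
  &\le m \Big( \sum_{j=1}^{k-1} (n+1)^{j} + n(n+1)^{k-1} \Big) \\
  &= m \Big( \frac{(n+1)^{k} - (n+1)}{n} + n(n+1)^{k-1} \Big) \\
  &\le m \big( (n+1)^{k} + (n+1)^{k} \big) = 2\, m\, (n+1)^{k}.
\end{align*}
```

Since $m \le l$, the total number of operator applications is at most $2\,l\,(n+1)^k$, which is within the stated $O((n+1)^k \times l)$.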

We can restate ?? as: algorithm1 computes the answers to queries defined by Monadic inf-Datalog
programs with linear-space (data and program) complexities, linear-time program complexity, and
polynomial-time data complexity.




Figure 3 Another structure of size 3

Example 5 Consider the structure given in ?? , where $Suc_1(1,1), Suc_0(1,2), Suc_0(2,3), p(1), p(2), p(3)$ hold
and the program $P$ below (where $l = k = 2$):

$\overline{\varphi}^2(x) \leftarrow \theta^1(x), Suc_0(x,y), Suc_1(x,z), \overline{\varphi}^2(y), \overline{\varphi}^2(z)$
$\theta^1(x) \leftarrow Suc_i(x,y), \theta^1(y)$ for $i = 0, 1$
$\theta^1(x) \leftarrow p(x), Suc_i(x,y), \overline{\varphi}^2(y)$ for $i = 0, 1$

Then algorithm1 will compute: 1. for $f_2 = \top$, $f_1 = \emptyset$, $f_1 = \{1,2\}$, and $f_2 = \{1,2\}$; then, 2. for $f_2 = \{1,2\}$,
$f_1 = \emptyset$, $f_1 = \{1\}$, and $f_2 = \{1\}$; then, 3. for $f_2 = \{1\}$, $f_1 = \emptyset$, $f_1 = \{1\}$, and $f_2 = \emptyset$; a last round will give 4.
for $f_2 = \emptyset$, $f_1 = \emptyset$. ($P$ is the translation of the temporal logic formula $\varphi = E(F^\infty p \wedge A{\circ}F^\infty p)$, expressing
that there exists a path on which p holds infinitely often and moreover, on all successors of the first state
of that path, again p holds infinitely often.)

Corollary 5 1. The set of queries defined by Monadic inf-Datalog programs can be computed in time
polynomial in the size of the data structure, exponential in the number of alternations of least fixed points
and greatest fixed points, and linear in the number of IDBs. The space complexity is linear in $n \times l$ ($n$
is the size of the structure and $l$ the number of IDBs).

2. The model checking problem for the Modal $\mu$-calculus can be solved in time polynomial in the size of
the model and exponential in the number of syntactic alternations of the formula. The space complexity
is linear in $|M| \times |f|$ ($|M|$ is the size of the model and $|f|$ the size of the formula).

Proof. 2 follows from the fact that in [] we gave a translation from modal $\mu$-calculus formulas into Monadic
(in fact modal) inf-Datalog programs, such that the number $k$ of alternations in the program is equal to
the number of syntactic alternations [] of the formula and the number $l$ of IDBs is less than the size of
the formula. 1 is a restatement of ?? . □



4 Conclusion 

We gave a (linear-) polynomial-time algorithm computing the answers to the queries defined by a (stratified)
Monadic inf-Datalog program. The time complexity of this algorithm is $O(n^{k+1})$ where $n$ is the size
of the database and $k$ the number of alternations. We believe that this bound could be slightly improved:
indeed there are algorithms for model checking formulas of $L\mu_k$ (which is equivalent to a fragment of
Monadic inf-Datalog), with upper bounds 0((n x \f\)'') [,] and 0((n x l/j)^^'^''^) [] (compared to our 
bound $O((n+1)^{k+1} \times |f|)$); however the space complexity of the improved algorithm in [] becomes
exponential whilst the space complexity of the naive algorithms is polynomial [].